Response to May 27, 2004 Office Action 
Application No. 09/705,069 
Page 2 



IN THE CLAIMS 

Please substitute claims 1-56 with the following:
1-2. (Cancelled). 

3. (Currently Amended) A method for classifying signals comprising:
dividing an input signal into blocks having a predetermined time length; 

extracting one or more than one characteristic quantities of a signal attribute from the 
signal of each block; and 

classifying the signal of each block into a category according to the characteristic 
quantities thereof, [[wherein said signal of each block is classified into any of the categories
formed on the basis of types of signal sources, and]] wherein said signal of each block is classified
into any of the categories formed on the basis of types of structures that signals may have and do
not depend on the types of signal sources.

4. (Currently Amended) The method for classifying signals according to claim [[3]] 
51, wherein 

said input signal is an audio signal; and 

the categories formed on the basis of the signal sources for classifying the audio signal of 
each block include one or more than one of silence, voice, male voice, female voice, music, 
vocal music, instrumental music, noise, striking sound, environmental sound, sound of hustle and 
bustle, clapping sound and cheering sound and are used for categorical classification based on 
the sound sources. 

5. (Original) The method for classifying signals according to claim 3, wherein
said input signal is an audio signal; and 

the categories formed on the basis of structures that signals may have and do not depend 
on the types of signal sources for classifying the audio signal of each block include one or more 
than one of a silence structure where no significant sound exists in the block, a single sound 
source structure where only a sound related to a single sound source exists in the block, a double 
sound source structure where sounds related respectively to two sound sources exist in the block, 
a sound source change structure where a sound source including silence is switched only for once
in the block, a multiple sound source change structure where a plurality of sound sources are 
switched simultaneously in the block, a sound source partial change structure where part of a 
plurality of sound sources are switched in the block and an extra structure pattern where none of 
the above patterns is applicable and are used for categorical classification based on the 
structures. 

6. (Previously Presented) The method for classifying signals according to claim 3, 
wherein one or more than one of the average and variances of the signal power in the block, the 
average and variances of the power of a band-pass signal of the signal in the block, the average 
and variances of the spread of the spectrogram of the signal in the block, the average and 
variances of the pitch frequency of the signal in the block, the average and variances of the 
degree of harmonic structurization of the signal in the block, the average and variances of the 
residue signal of linear predictive analysis of the signal in the block and the average and 
variances of the pitch gain of the residue signal of linear predictive analysis of the signal in the 
block are used as said characteristic quantities. 
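
For illustration only, and not as part of the claim language: a minimal Python sketch of the first characteristic quantity listed in claim 6, the average and variance of the signal power in a block. The function names, the use of fixed-length frames inside each block, and the choice to drop a trailing partial block are assumptions of this sketch, not details taken from the application.

```python
from statistics import fmean, pvariance


def split_into_blocks(samples, block_len):
    """Divide a sample sequence into consecutive blocks of block_len samples.

    Assumption of this sketch: a trailing partial block is dropped.
    """
    return [samples[i:i + block_len]
            for i in range(0, len(samples) - block_len + 1, block_len)]


def power_features(block, frame_len):
    """Return (average, variance) of the short-time power within one block.

    The block is cut into frames of frame_len samples (hypothetical
    parameter); the power of a frame is its mean squared amplitude.
    The block must contain at least one full frame.
    """
    frames = split_into_blocks(block, frame_len)
    powers = [fmean(x * x for x in frame) for frame in frames]
    return fmean(powers), pvariance(powers)
```

Calling `power_features` on each element of `split_into_blocks(signal, block_len)` would give one feature pair per block; the other quantities listed in the claim could be appended to the same per-block vector.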

7. (Original) The method for classifying signals according to claim 6, wherein

said average of the degree of harmonic structurization is the temporal average of the ratio 
of the energy of the sound component of integer times of the pitch frequency to the energy
of all the frequencies; and 

said variances of the degree of harmonic structurization is the temporal standard 
deviation of the ratio of the energy of the sound component of integer times of the pitch 
frequency to the energy of all the frequencies. 
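
For illustration only: a hedged Python sketch of the ratio described in claim 7, the energy at integer multiples of the pitch frequency divided by the energy of all frequencies, followed by its temporal average and standard deviation. Representing each frame as a list of per-bin spectral energies plus a pitch bin index, and using the population standard deviation, are assumptions of this sketch.

```python
from statistics import fmean, pstdev


def harmonic_ratio(spectrum_energy, pitch_bin):
    """Energy at integer multiples of pitch_bin divided by total energy.

    spectrum_energy: per-bin energies of one frame (hypothetical layout).
    pitch_bin: index of the bin holding the pitch frequency.
    Returns 0.0 for a silent frame or a non-positive pitch bin.
    """
    total = sum(spectrum_energy)
    if total == 0 or pitch_bin <= 0:
        return 0.0
    harmonic = sum(spectrum_energy[k]
                   for k in range(pitch_bin, len(spectrum_energy), pitch_bin))
    return harmonic / total


def harmonic_structure_stats(frames):
    """Return (temporal average, temporal standard deviation) of the ratio.

    frames: iterable of (spectrum_energy, pitch_bin) pairs for one block.
    """
    ratios = [harmonic_ratio(energy, pitch) for energy, pitch in frames]
    return fmean(ratios), pstdev(ratios)
```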

8. (Previously Presented) The method for classifying signals according to claim 3, 
wherein a vector quantization technique is used as a method for the categorical classification. 
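
For illustration only: one common form of vector quantization assigns a feature vector to the label of its nearest codeword. The sketch below assumes a codebook with one codeword per category and Euclidean distance; the codebook contents and the function name are hypothetical and are not taken from the application.

```python
import math


def classify_by_vq(feature_vector, codebook):
    """Return the category label whose codeword is nearest to feature_vector.

    codebook: dict mapping a category label to a codeword of the same
    dimension as feature_vector (hypothetical, e.g. learned from
    labelled training blocks).
    """
    return min(codebook,
               key=lambda label: math.dist(feature_vector, codebook[label]))
```

With a codebook such as `{"silence": (0.0, 0.0), "voice": (5.0, 2.0)}`, a block whose features are `(0.2, 0.1)` would be assigned to `"silence"`.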

9-10. (Cancelled). 

11. (Currently Amended) An apparatus for classifying signals comprising:

a blocking means for dividing an input signal into blocks having a predetermined time
length;

a feature extracting means for extracting one or more than one characteristic quantities of 
a signal attribute from the signal of each block; and 

a categorical classifying means for classifying the signal of each block into a category 
according to the characteristic quantities thereof, [[wherein said categorical classifying means
classifies said signal of each block into any of the categories formed on the basis of types of
signal sources, and]] wherein said categorical classifying means classifies said signal of each block
into any of the categories formed on the basis of types of structures that signals may have and do 
not depend on the types of signal sources. 

12. (Currently Amended) The apparatus for classifying signals according to
claim [[11]] 52, wherein 

said input signal is an audio signal; and 

the categories formed on the basis of signal sources for classifying the audio signal of 
each block include one or more than one of silence, voice, male voice, female voice, music, vocal
music, instrumental music, noise, striking sound, environmental sound, sound of hustle and 
bustle, clapping sound and cheering sound and are used for categorical classification based on 
the sound sources. 

13. (Original) The apparatus for classifying signals according to claim 11, wherein
said input signal is an audio signal; and 

the categories formed on the basis of structures that signals may have and do not depend 
on the types of signal sources for classifying the audio signal of each block include one or more 
than one of a silence structure where no significant sound exists in the block, a single sound 
source structure where only a sound related to a single sound source exists in the block, a double
sound source structure where sounds related respectively to two sound sources exist in the block, 
a sound source change structure where a sound source including silence is switched only for 
once in the block, a multiple sound source change structure where a plurality of sound sources 
are switched simultaneously in the block, a sound source partial change structure where part of a 
plurality of sound sources are switched in the block and an extra structure pattern where none of 
the above patterns is applicable and are used for categorical classification based on the 
structures. 

14. (Previously Presented) The apparatus for classifying signals according to
claim 11, wherein said feature extracting means uses one or more than one of the average and 
variances of the signal power in the block, the average and variances of the power of a band-pass 
signal of the signal in the block, the average and variances of the spread of the spectrogram of 
the signal in the block, the average and variances of the pitch frequency of the signal in the 
block, the average and variances of the degree of harmonic structurization of the signal in the 
block, the average and variances of the residue signal of linear predictive analysis of the signal in 
the block and the average and variances of the pitch gain of the residue signal of linear predictive 
analysis of the signal in the block as said characteristic quantities. 

15. (Original) The apparatus for classifying signals according to claim 14, wherein 
said average of the degree of harmonic structurization is the temporal average of the ratio 

of the energy of the sound component of integer times of the pitch frequency to the energy of all 
the frequencies; and 

said variances of the degree of harmonic structurization is the temporal standard 
deviation of the ratio of the energy of the sound component of integer times of the pitch 
frequency to the energy of all the frequencies. 

16. (Previously Presented) The apparatus for classifying signals according to 
claim 11, wherein said categorical classifying means uses a vector quantization technique as 
a method for the categorical classification.

17-18. (Cancelled). 

19. (Currently Amended) A method for generating descriptors comprising:
dividing an input signal into blocks having a predetermined time length; 

extracting one or more than one characteristic quantities of a signal attribute from the
signal of each block; 

classifying the signal of each block into a category according to the characteristic 
quantities thereof, [[wherein said signal of each block is classified into any of the categories
formed on the basis of types of signal sources, and]] wherein said signal of each block is classified
into any of the categories formed on the basis of types of structures that signals may have and do 
not depend on the types of signal sources; and 

generating a descriptor for the signal according to the category of classification thereof. 
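
For illustration only: one way a descriptor could be generated from per-block categories is to merge runs of consecutive blocks that share a category into time-stamped segments. The dictionary layout, the field names, and the merging step are assumptions of this sketch; the application does not specify a descriptor format in these claims.

```python
from itertools import groupby


def generate_descriptors(block_labels, block_seconds):
    """Merge consecutive blocks with the same category into segments.

    block_labels: category label of each block, in time order.
    block_seconds: the predetermined time length of one block.
    Returns a list of dicts with hypothetical keys
    "category", "start" and "end" (times in seconds).
    """
    descriptors = []
    index = 0
    for label, run in groupby(block_labels):
        count = len(list(run))
        descriptors.append({
            "category": label,
            "start": index * block_seconds,
            "end": (index + count) * block_seconds,
        })
        index += count
    return descriptors
```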

20. (Currently Amended) The method for generating descriptors according to 
claim [[19]] 53, wherein 

said input signal is an audio signal; and

the categories formed on the basis of signal sources for classifying the audio signal of 
each block include one or more than one of silence, voice, male voice, female voice, music, 
vocal music, instrumental music, noise, striking sound, environmental sound, sound of hustle and 
bustle, clapping sound and cheering sound and are used for categorical classification based on 
the sound sources. 

21. (Original) The method for generating descriptors according to claim 19, wherein 
said input signal is an audio signal; 

the categories formed on the basis of structures that signals may have and do not depend
on the types of signal sources for classifying the audio signal of each block include one or more 
than one of a silence structure where no significant sound exists in the block, a single sound
source structure where only a sound related to a single sound source exists in the block, a double 
sound source structure where sounds related respectively to two sound sources exist in the
block, a sound source change structure where a sound source including silence is switched only 
for once in the block, a multiple sound source change structure where a plurality of sound 
sources are switched simultaneously in the block, a sound source partial change structure where 
part of a plurality of sound sources are switched in the block and an extra structure pattern where 
none of the above patterns is applicable and are used for categorical classification based on the 
structures; and 

a descriptor is generated according to the categorical classification based on the
structures. 

22. (Previously Presented) The method for generating descriptors according to 
claim 19, wherein one or more than one of the average and variances of the signal power in the 
block, the average and variances of the power of a band-pass signal of the signal in the block, the
average and variances of the spread of the spectrogram of the signal in the block, the average and 
variances of the degree of harmonic structurization of the signal in the block, the average and 
variances of the residue signal of linear predictive analysis of the signal in the block and the 
average and variances of the pitch gain of the residue signal of linear predictive analysis of the 
signal in the block are used as said characteristic quantities. 

23. (Original) The method for generating descriptors according to claim 22, wherein
said average of the degree of harmonic structurization is the temporal average of the ratio 

of the energy of the sound component of integer times of the pitch frequency to the energy of all 
the frequencies; and 

said variances of the degree of harmonic structurization is the temporal standard 
deviation of the ratio of the energy of the sound component of integer times of the pitch 
frequency to the energy of all the frequencies. 

24. (Previously Presented) The method for generating descriptors according to 
claim 19, wherein a vector quantization technique is used as a method for the categorical
classification. 

25-26. (Cancelled). 

27. (Currently Amended) An apparatus for generating descriptors comprising: 

a blocking means for dividing an input signal into blocks having a predetermined time 

length; 

a feature extracting means for extracting one or more than one characteristic quantities of 
a signal attribute from the signal of each block; 

a categorical classifying means for classifying the signal of each block into a category 
according to the characteristic quantities thereof, [[wherein said categorical classifying means
classifies said signal of each block into any of the categories formed on the basis of types of
signal sources, and]] wherein said categorical classifying means classifies said signal of each block
into any of the categories formed on the basis of types of structures that signals may have and do 
not depend on the types of signal sources; and 

a descriptor generating means for generating a descriptor for the signal according to the
category of classification thereof. 

28. (Currently Amended) The apparatus for generating descriptors according to 
claim [[27]] 54, wherein 

said input signal is an audio signal; and 

the categories formed on the basis of signal sources for classifying the audio signal of 
each block include one or more than one of silence, voice, male voice, female voice, vocal 
music, instrumental music, noise, striking sound, environmental sound, sound of hustle and 
bustle, clapping sound and cheering sound and are used for categorical classification based on 
the sound sources. 

29. (Original) The apparatus for generating descriptors according to claim 27, 
wherein 

said input signal is an audio signal; 

the categories formed on the basis of structures that signals may have and do not depend 
on the types of signal sources for classifying the audio signal of each block include one or more 
than one of a silence structure where no significant sound exists in the block, a single sound 
source structure where only a sound related to a single sound source exists in the block, a double 
sound source structure where sounds related respectively to two sound sources exist in the block, 
a sound source change structure where a sound source including silence is switched only for 
once in the block, a multiple sound source change structure where a plurality of sound sources 
are switched simultaneously in the block, a sound source partial change structure where part of a 
plurality of sound sources are switched in the block and an extra structure pattern where none of 
the above patterns is applicable and are used for categorical classification based on the
structures; and 

said descriptor generating means generates a descriptor according to the categorical 
classification based on the structures. 

30. (Previously Presented) The apparatus for generating descriptors according to 
claim 27, wherein said feature extracting means uses one or more than one of the average and 
variances of the signal power in the block, the average and variances of the power of a
band-pass signal of the signal in the block, the average and variances of the spread of the spectrogram
of the signal in the block, the average and variances of the pitch frequency of the signal in the 
block, the average and variances of the degree of harmonic structurization of the signal in the 
block, the average and variances of the residue signal of linear predictive analysis of the signal in 
the block and the average and variances of the pitch gain of the residue signal of linear predictive 
analysis of the signal in the block as said characteristic quantities. 

31. (Original) The apparatus for generating descriptors according to claim 30, 
wherein 

said average of the degree of harmonic structurization is the temporal average of the ratio 
of the energy of the sound component of integer times of the pitch frequency to the energy of all 
the frequencies; and 

said variances of the degree of harmonic structurization is the temporal standard 
deviation of the ratio of the energy of the sound component of integer times of the pitch 
frequency to the energy of all the frequencies. 

32. (Previously Presented) The apparatus for generating descriptors according to
claim 27, wherein said categorical classifying means uses a vector quantization technique as 
a method for the categorical classification.

33-34. (Cancelled). 

35. (Currently Amended) A method for retrieving signals comprising: 
dividing an input signal into blocks having a predetermined time length; 

extracting one or more than one characteristic quantities of a signal attribute from the 
signal of each block; 

classifying the signal of each block into a category according to the characteristic 
quantities thereof, [[wherein said signal of each block is classified into any of the categories
formed on the basis of types of signal sources, and]] wherein said signal of each block is classified
into any of the categories formed on the basis of types of structures that signals may have and do 
not depend on the types of signal sources; and 

retrieving the signal according to the result of categorical classification or by using a 
descriptor generated according to the result of categorical classification. 

36. (Currently Amended) The method for retrieving signals according to claim [[35]] 
55, wherein

said input signal is an audio signal; 

the categories formed on the basis of signal sources for classifying the audio signal of 
each block include one or more than one of silence, voice, male voice, female voice, music, 
vocal music, instrumental music, noise, striking sound, environmental sound, sound of hustle and 
bustle, clapping sound and cheering sound and are used for categorical classification based on
the sound sources; and 

a signal is retrieved by using the descriptor reflecting or corresponding to the result of 
said categorical classification based on the sound sources. 

37. (Original) The method for retrieving signals according to claim 35, wherein 
said input signal is an audio signal; 

the categories formed on the basis of structures that signals may have and do not depend 
on the types of signal sources for classifying the audio signal of each block include one or more 
than one of a silence structure where no significant sound exists in the block, a single sound 
source structure where only a sound related to a single sound source exists in the block, a double 
sound source structure where sounds related respectively to two sound sources exist in the block, 
a sound source change structure where a sound source including silence is switched only for 
once in the block, a multiple sound source change structure where a plurality of sound sources 
are switched simultaneously in the block, a sound source partial change structure where part of a 
plurality of sound sources are switched in the block and an extra structure pattern where none of 
the above patterns is applicable and are used for categorical classification based on the 
structures; and 

a signal is retrieved by using the descriptor reflecting or corresponding to the result of 
said categorical classification based on the structure. 

38. (Previously Presented) The method for retrieving signals according to claim 35, 
wherein one or more than one of the average and variances of the signal power in the block, the 
average and variances of the power of a band-pass signal of the signal in the block, the average 
and variances of the spread of the spectrogram of the signal in the block, the average and
variances of the pitch frequency of the signal in the block, the average and variances of the 
degree of harmonic structurization of the signal in the block, the average and variances of the 
residue signal of linear predictive analysis of the signal in the block and the average and 
variances of the pitch gain of the residue signal of linear predictive analysis of the signal in the 
block are used as said characteristic quantities. 

39. (Original) The method for retrieving signals according to claim 38, wherein 

said average of the degree of harmonic structurization is the temporal average of the ratio 
of the energy of the sound component of integer times of the pitch frequency to the energy of all 
the frequencies; and 

said variances of the degree of harmonic structurization is the temporal standard 
deviation of the ratio of the energy of the sound component of integer times of the pitch 
frequency to the energy of all the frequencies. 

40. (Previously Presented) The method for retrieving signals according to claim 35, 
wherein a vector quantization technique is used as a method for the categorical classification.

41. (Previously Presented) The method for retrieving signals according to claim 35, 
wherein points of changes of the signal are detected by using the descriptor reflecting or 
corresponding to the result of said categorical classification. 
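
For illustration only: if the descriptor is taken to be the sequence of per-block category labels, points of change can be found by comparing each block with its predecessor. That reading of the descriptor, and the function name, are assumptions of this sketch.

```python
def change_points(block_labels):
    """Return the indices of blocks whose category differs from the previous block."""
    return [i for i in range(1, len(block_labels))
            if block_labels[i] != block_labels[i - 1]]
```

Multiplying each returned index by the block length would give the approximate time of each change in the signal.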

42-43. (Cancelled). 

44. (Currently Amended) An apparatus for retrieving signals comprising: 

a blocking means for dividing an input signal into blocks having a predetermined time 

length; 

a feature extracting means for extracting one or more than one characteristic quantities of
a signal attribute from the signal of each block; 

a categorical classifying means for classifying the signal of each block into a category 
according to the characteristic quantities thereof, [[wherein said categorical classifying means
classifies said signal of each block into any of the categories formed on the basis of types of
signal sources, and]] wherein said categorical classifying means classifies said signal of each block
into any of the categories formed on the basis of types of structures that signals may have and do 
not depend on the types of signal sources; and 

a signal retrieving means for retrieving the signal according to the result of categorical 
classification or by using a descriptor generated according to the result of categorical 
classification. 

45. (Currently Amended) The apparatus for retrieving signals according to 
claim [[44]] 56, wherein 

said input signal is an audio signal; 

the categories formed on the basis of signal sources for classifying the audio signal of 
each block include one or more than one of silence, voice, male voice, female voice, music, 
vocal music, instrumental music, noise, striking sound, environmental sound, sound of hustle and 
bustle, clapping sound and cheering sound and are used for categorical classification based on 
the sound sources; and said signal retrieving means retrieves a signal by using the descriptor 
reflecting or corresponding to the results of said categorical classification based on the sound 
sources. 

46. (Original) The apparatus for retrieving signals according to claim 44, wherein
said input signal is an audio signal; 

the categories formed on the basis of structures that signals may have and do not depend 
on the types of signal sources for classifying the audio signal of each block include one or more 
than one of a silence structure where no significant sound exists in the block, a single sound 
source structure where only a sound related to a single sound source exists in the block, a double 
sound source structure where sounds related respectively to two sound sources exist in the block, 
a sound source change structure where a sound source including silence is switched only for 
once in the block, a multiple sound source change structure where a plurality of sound sources 
are switched simultaneously in the block, a sound source partial change structure where part of a 
plurality of sound sources are switched in the block and an extra structure pattern where none of
the above patterns is applicable and are used for categorical classification based on the 
structures; and 

said signal retrieving means retrieves a signal by using the descriptor reflecting or 
corresponding to the result of said categorical classification based on the structure. 

47. (Previously Presented) The apparatus for retrieving signals according to claim 44, 
wherein said feature extracting means uses one or more than one of the average and variances of 
the signal power in the block, the average and variances of the power of a band-pass signal of the 
signal in the block, the average and variances of the spread of the spectrogram of the signal in 
the block, the average and variances of the pitch frequency of the signal in the block, the average 
and variances of the degree of harmonic structurization of the signal in the block, the average 
and variances of the residue signal of linear predictive analysis of the signal in the block and the
average and variances of the pitch gain of the residue signal of linear predictive analysis of the
signal in the block as said characteristic quantities. 

48. (Original) The apparatus for retrieving signals according to claim 47, wherein 
said average of the degree of harmonic structurization is the temporal average of the ratio 

of the energy of the sound component of integer times of the pitch frequency to the energy of all 
the frequencies; and 

said variances of the degree of harmonic structurization is the temporal standard 
deviation of the ratio of the energy of the sound component of integer times of the pitch 
frequency to the energy of all the frequencies. 

49. (Previously Presented) The apparatus for retrieving signals according to claim 44, 
wherein said categorical classifying means uses a vector quantization technique as a method for
the categorical classification. 

50. (Previously Presented) The apparatus for retrieving signals according to claim 44, 
wherein said signal retrieving means detects points of changes of the signal by using the 
descriptor reflecting or corresponding to the result of said categorical classification. 

51. (New) The method for classifying signals according to claim 3, wherein said 
signal of each block is classified into any of the categories formed on the basis of types of signal 
sources. 

52. (New) The apparatus for classifying signals according to claim 11, wherein said 
categorical classifying means classifies said signal of each block into any of the categories 
formed on the basis of types of signal sources. 

53. (New) The method for generating descriptors according to claim 19, wherein said
signal of each block is classified into any of the categories formed on the basis of types of signal 
sources. 

54. (New) The apparatus for generating descriptors according to claim 27, wherein 
said categorical classifying means classifies said signal of each block into any of the categories 
formed on the basis of types of signal sources. 

55. (New) The method for retrieving signals according to claim 35, wherein said 
signal of each block is classified into any of the categories formed on the basis of types of signal 
sources. 

56. (New) The apparatus for retrieving signals according to claim 44, wherein said 
categorical classifying means classifies said signal of each block into any of the categories 
formed on the basis of types of signal sources. 